EXHIBIT 6 



MOLECULAR BIOLOGY OF 



THE CELL 



Bruce Alberts 
Alexander Johnson 
Julian Lewis 
Martin Raff 
Keith Roberts 
Peter Walter 



Garland Science 



BEST AVAILABLE COPY 



fourth 



edition 




Taylor &. Francis Croup 



Garland 

Vice President: Denise Schanck 

Managing Editor Sarah Gibbs 

Senior Editorial Assistant: Kirsten Jenner 

Managing Production Editor. Emma Hunt 

Proofreader and Layout: Emma Hunt 

Production Assistant Angela Bennett 

Text Editors: Marjorie Singer Anderson and Betsy Dilernia 

Copy Editor: Bruce Goatly 

Word Processors: Fran Dependahl, Misty Landers and Carol Winter 

Designer: Blink Studio, London 

Illustrator: Nigel Orme 

Indexer Janine Ross and Sherry Granum 

Manufacturing: Nigel Eyre and Marion Morrow 



Cell Biology Interactive 

Artistic and Scientific Direction: Peter Walter 

Narrated by: Julie Theriot 

Production, Design, and Development Mike Morales 



Bruce Alberts received bis Ph.D. from Harvard University and is 
President of the National Academy of Sciences and Professor of 
Biochemistry and Biophysics at the University of California, San 
Francisco. Alexander Johnson received his Ph.D. from Harvard 
University and is a Professor of Microbiology and Immunology at 
the University of California, San Francisco. Julian Lewis received 
his D.Phil, from the University of Oxford and is a Principal 
Scientist at the Imperial Cancer Research Fund, London. 
Martin Raff received his M.D. from McGill University and is at the 
Medical Research Council Laboratory for Molecular Cell Biology 
and Cell Biology Unit and in the Biology Department at University 
College London. Keith Roberts received his Ph.D. from the 
University of Cambridge and is Associate Research Director at the 
John Innes Centre, Norwich. Peter Waiter received his Ph.D. from 
The Rockefeller University in New York and is Professor and 
Chairman of the Department of Biochemistry and Biophysics at 
the University of California, San Francisco, and an Investigator of 
the Howard Hughes Medical Institute. 



© 2002 by Bruce Alberts, Alexander Johnson, Julian Lewis, 
Martin Raff, Keith Roberts, and Peter Walter. 
© 1983, 1989, 1994 by Bruce Alberts, Dennis Bray, Julian Lewis, 
Martin Raff, Keith Roberts, and James D.Watson. 



All rights reserved. No part of this book covered by the copyright 
hereon may be reproduced or used In any format in any form or 
by any means— graphic, electronic, or mechanical, including 
photocopying, recording, taping, or information storage and 
retrieval systems— without permission of the publisher! 



library of Congress Cataloging-in-Publicaton Data 
Molecular biology of the cell / Bruce Alberts ... [et all.-- 4th ed. 
P* crn 

Includes bibliographical references and index. 

ISBN 0-8153-3218-1 (hardbound) - ISBN 0-8153-4072-9 (pbk.) 

1. Cytology. 2. Molecular biology. I. Alberts, Bruce. 

[DNLM: 1. Cells. 2. Molecular Biology. 3 
QH581.2.M64 2002 
571.6-dc21 

2001054471 CIP 



Published by Garland Science, a member of the Taylor & Francis Group, 
29 West 35th Street, New York, NY 10001-2299 

Printed in the United States of America 

15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 



Front cover Human Genome: Reprinted by permission 
from Nature, International Human Genome Sequencing 
Consortium, 409:860-921, 2001 © MacmiUan Magazines 
Ltd. Adapted from an image by Francis Collins, NHGRI; 
Jim Kent, UCSC; Ewan Birney, EBI; and Darryl Leja, 
NHGRI; showing a portion of Chromosome 1 from the 
initial sequencing of the human genome. 

Back cover In 1967, the British artist Peter Blake created 
a design classic. Nearly 35 years later Nigel Orme 
(illustrator), Richard Denyer (photographer), and the 
authors have together produced an affectionate tribute 
to Mr Blake's image. With its gallery of icons and 
influences, its assembly created almost as much - 
complexity, intrigue and mystery as the original. 
Drosophila, Ambidopsis, Dolly and the assembled 
company tempt you to dip inside where, as in the 
original, "a splendid time is guaranteed for all." 
(Gunter Blobel, courtesy of The Rockefeller University; Marie 
Curie, Keystone Press Agency Inc; Darwin bust, by permission 
of the President and Council of the Royal Society; Rosalind 
Franklin, courtesy of Cold Spring Harbor Laboratory Archives; 
Dorothy Hodgkin, ©The Nobel Foundation, 1964; James Joyce, 
etching by Peter Blake; Robert Johnson, photo booth 
self-portrait early 1930s, © 1986 Delta Haze Corporation all 
rights reserved, used by permission; Albert L. Lehninger, 
(unidentified photographer) courtesy of The Alan Mason 
Chesney Medical Archives of The Johns Hopkins Medical 
Institutions; Linus Pauling, from Ava Helen and Linus Pauling 
Papers, Special Collections, Oregon State University; Nicholas 
Poussin, courtesy of ArrTbdayxom; Barbara McCUntock, 
© David Micklos, 1983; Andrei Sakharov, courtesy of Elena 
Bonner; Frederick Sanger, © The Nobel Foundation, 1958.) 



BEST AVAILABLE COPY 



gene A 



gene B 



DNA 



I TRANSCRIPTION 
^^mmm rna 



TRANSLATION 



TRANSCRIPTION 



i RNA 



TRANSLATION 



FROM DNA TO RNA 

Transcription and translation are the means by which cells read out, or express, 
the genetic instructions in their genes. Because many identical RNA copies can 
be made from the same gene, and each RNA molecule can direct the synthesis 
of many identical protein molecules, cells can synthesize a large amount of 
protein rapidly when necessary. But each gene can also be transcribed and 
translated with a different efficiency, allowing the cell to make vast quantities of 
some proteins and tiny quantities of others (Figure 6-3). Moreover, as we see in 
the next chapter, a cell can change (or regulate) the expression of each of its 
genes according to the needs of the moment— most obviously by controlling 
the production of its RNA. 



Figure 6-3 Genes can be expressed 
with different efficiencies. Gene A is 
transcribed and translated much more 
efficiently than gene B.This allows the 
amount of protein A in the cell to be 
much greater than that of protein B. 



Portions of DNA Sequence Are Transcribed into RNA 

The first step a cell takes in reading out a needed part of its genetic instructions 
is to copy a particular portion of its DNA nucleotide sequence — a gene — into an 
RNA nucleotide sequence. The information in RNA, although copied into another 
chemical form, is still written in essentially the same language as it is in DNA— 
the language of a nucleotide sequence. Hence the name transcription. 

Like DNA, RNA is a linear polymer made of four different types of nucleotide 
subunits linked together by phosphodiester bonds (Figure 6-4). It differs from 
DNA chemically in two 4 respects: (1) the nucleotides in RNA are 
ribonucleotides— that is, they contain the sugar ribose (hence the name ribonu- 
cleic acid) rather than deoxyribose; (2) although, like DNA, RNA contains the 
bases adenine (A), guanine (G), and cytosine (C) y it contains the base uracil (U) 
instead of the thymine (T) in DNA. Since U, like T, can base-pair by hydrogen- 
bonding with A (Figure 6-5), the complementary base-pairing properties 
described for DNA in Chapters 4 and 5 apply also to RNA (in RNA, G pairs with 
C, and A pairs with U). It is not uncommon, however, to find other types of base 
pairs in RNA: for example, G pairing with U occasionally. 

Despite these small chemical differences, DNA and RNA differ quite dra- 
matically in overall structure. Whereas DNA always occurs in cells as a double- 
stranded helix, RNA is single-stranded. RNA chains therefore fold up into a 
variety of shapes, just as a polypeptide chain folds up to form the final shape of 
a protein (Figure 6-6). As we see later in this chapter, the ability to fold into com- 
plex three-dimensional shapes allows some RNA molecules to have structural 
and catalytic functions. 



Transcription Produces RNA Complementary to 
One Strand of DNA 

All of the RNA in a cell is made by DNA transcription, a process that has cer- 
tain similarities to the process of DNA replication discussed in Chapter 5. 
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Figure 6-89 Protein aggregates that cause human disease. (A) Schematic illustration of the type of 
conformational change in a protein that produces material for a cross-beta filament. (B) Diagram illustrating 
the self-infectious nature of the protein aggregation that is central to prion diseases. PrP is highly unusual 
because the misfolded version of the protein, called PrP*, induces the norma! PrP protein it contacts* to 
change its conformation, as shown. Most of the human diseases caused by protein aggregation are caused by 
the overproduction of a variant protein that is especially prone to aggregation, but because this structure Is 
not Infectious In this way, it cannot spread from one animal to another. (Q Drawing of a cross-beta filament 
a common type of protease-resistant protein aggregate found in a variety of human neurological diseases. 
Because the hydrogen-bond interactions in a 0 sheet form between polypeptide backbone atoms (see Figure 
3-9), a number of different abnormally folded proteins can produce this structure. (D) One of several 
possible models for the conversion of PrP to PrP*, showing the likely change of two a-helices into four 
p-sQands. Although the structure of the normal protein has been determined accurately, the structure of the 
infectious form is not yet known with certainty because the aggregation has prevented the use of standard 
structural techniques. (C, courtesy of Louise Serpell, adapted from M. Sunde et al. J. Mol Biol 273:729-739, 
1997; D, adapted from S.B. Prusiner, Trends Biochem. So. 21:482-487, 1996.) 

animals and humans. It can be dangerous to eat the tissues of animals that con- 
tain PrP*, as witnessed most recently by the spread of BSE (commonly referred 
to as the "mad cow disease") from cattle to humans in Great Britain, 

Fortunately in the absence of PrP*, PrP is extraordinarily difficult to convert 
to its abnormal form. Although very few proteins have the potential to misfold 
into an infectious conformation, a similar transformation has been discovered 
to be the cause of an otherwise mysterious "protein-only inheritance" observed 
in yeast cells. 

There Are Many Steps From DNA to Protein 

We have seen so far in this chapter that many different types of chemical reac- 
tions are required to produce a properly folded protein from the information 
contained in a gene (Figure 6-90). The final level of a properly folded protein in 
a cell therefore depends upon the efficiency with which each of the many steps 
is performed. 

We discuss in Chapter 7 that cells have the ability to change the levels of 
their proteins according to their needs. In principle, any or all of the steps in Fig- 
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Figure 6-90 The production of a 
protein by a eucaryotlc cell. The final 
level of each protein in a eucaryotic cell 
depends upon the efficiency of each step 
depicted. 
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ure 6-90) couid be regulated fay the cell for each individual protein. However, as 
we shall see in Chapter 7, the initiation of transcription is the most common 
point for a cell to regulate the expression of each of its genes. This makes sense, 
inasmuch as the most efficient way to keep a gene from being expressed is to 
block the very first step— the transcription of its DNA sequence into an RNA 
molecule. 



Summary 

The translation of the nucleotide sequence of an mRNA molecule into protein takes 
place in the cytoplasm on a large ribonucleoprotein assembly called a ribosome. The 
amino acids used for protein synthesis are first attached to a family of tRNA 
molecules, each of which recognizes, by complementary base-pair interactions, par- 
ticular sets of three nucleotides in the mRNA (codons). The sequence of nucleotides in 
the mRNA is then read from one end to the other in sets of three according to the 
genetic code. 

To initiate translation, a small ribosomal subunit binds to the mRNA molecule 
at a start codon (AUG) that is recognized by a unique initiator tRNA molecule. A 
large ribosomal subunit binds to complete the ribosome and begin the elongation 
phase of protein synthesis. During this phase, aminoacyl tRNAs—each bearing a 
specific amino acid bind sequentially to the appropriate codon in mRNA by forming 
complementary base pairs with the tRNA anUcodon. Each amino acid is added to the 
C-terminal end of the growing polypeptide by means of a cycle of three sequential 
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Figure 7-5 Six steps at which 
eucaryotie gene expression can be 
controlled. Controls that operate at 
steps I through 5 are discussed in. this 
chapter. Step 6, the regulation of protein 
activity, includes reversible activation or 
inactlvation by protein phosphorylation 
(discussed in Chapter 3) as well as 
irreversible inactivation by proteolytic 
degradation (discussed in Chapter 6). 



Gene Expression Can Be Regulated at Many of the Steps 
in the Pathway from DNA to RNA to Protein 

If differences among the various cell types of an organism depend on the partic- 
ular genes that the cells express, at what level is the control of gene expression 
exercised? As we saw in the last chapter, there are many steps in the pathway 
leading from DNA to protein, and all of them can in principle be regulated. Thus 
a cell can control the proteins it makes by (1) controlling when and how often a 
given gene is transcribed (transcriptional control), (2) controlling how the RNA 
transcript is spliced or otherwise processed (RNA processing control), (3) 
selecting which completed mRNAs in the cell nucleus are exported to the cytosol 
and determining where in the cytosol they are localized (RNA transport and 
localization control), (4) selecting which mRNAs in the cytoplasm are translated 
byribosomes (translational control), (5) selectively destabilizing certain mRNA 
molecules in the cytoplasm (mRNA degradation control), or (6) selectively acti- 
vating, inactivating, degrading, or compartmentalizing specific protein 
molecules after they have been made (protein activity control) (Figure 7-5). 

For most genes transcriptional controls are paramount. This makes sense 
because, of all the possible control points illustrated in Figure 7-5, only tran- 
scriptional control ensures that the cell wQl not synthesize superfluous interme- 
diates. In the following sections we discuss the DNA and protein components 
that perform this function by regulating the initiation of gene transcription. We 
shall return at the end of the chapter to the additional ways of regulating gene 
expression. 

Summary 

The genome of a cell contains in its DNA sequence the information to make many 
thousands of different protein and RNA molecules. A cell typically expresses only a 
fraction of its genes, and the different types of cells in multicellular organisms arise 
because different sets of genes are expressed. Moreover, cells can change the pattern 
of genes they express in response to changes in their environment, such as signals 
from other cells. Although all of the steps involved in expressing a gene can in prin- 
ciple be regulated, for most genes the initiation of RNA transcription is the most 
. important point of control 



DNA-BINDING MOTIFS IN GENE REGULATORY 
PROTEINS 

How does a cell determine which of its thousands of genes to transcribe? As 
mentioned briefly in Chapters 4 and 6, the transcription of each gene is con- 
trolled by a regulatory region of DNA relatively near the site where transcription 
begins. Some regulatory regions are simple and act as switches that are thrown 
;by a single signal. Many others are complex and act as tiny microprocessors, 
Responding to a variety of signals that they interpret and integrate to switch the 
^neighboring gene on or off. Whether complex or simple, these switching devices 
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occur in the germ line, the cell lineage that gives rise to sperm or eggs. Most of 
the DNA in vertebrate germ cells is inactive and highly methylated. Over long 
periods of evolutionary time, the methylated CG sequences in these inactive 
regions have presumably been lost through spontaneous deamination events 
that were not properly repaired However promoters of genes that remain active 
in the germ cell lineages (including most housekeeping genes) are kept 
unmethylated, and therefore spontaneous deaminations of Cs that occur with- 
in them can be accurately repaired. Such regions are preserved in modern day 
vertebrate cells as CG islands. In addition, any mutation of a CG sequence in the 
genome that destroyed the function or regulation of a gene in the adult would be 
selected against, and some CG islands are simply the result of a higher than nor- 
mal density of critical CG sequences. 

The mammalian genome contains an estimated 20,000 CG islands. Most of 
the islands mark the 5' ends of transcription units and thus, presumably, of 
genes. The presence of CG islands often provides a convenient way of identify- 
ing genes in the DNA sequences of vertebrate genomes. 

Summary 

The many types of cells in animals and plants are created largely through mecha- 
nisms that cause different genes to be transcribed in different cells. Since many 
specialized animal cells can maintain their unique character through many cell 
division cycles and even when grown in culture, the gene regulatory mechanisms 
involved in creating them must be stable once established and heritable when the 
cell divides. These features endow the cell with a memory of its developmental history. 
Bacteria and yeasts provide unusually accessible model systems in which to study 
gene regulatory mechanisms. One such mechanism involves a competitive interac- 
tion between two gene regulatory proteins, each of which inhibits the synthesis of the 

^ other; this can create a flip-flop switch that switches a cell between two alternative 
patterns of gene expression. Direct or indirect positive feedback loops, which enable 

I gene regulatory proteins to perpetuate their own synthesis, provide a general mech- 

* ■ anism for cell memory. Negative feedback loops with programmed delays form the 

'$. basis for cellular clocks. 

|V In eucaryotes the transcription of a gene is generally controlled by combinations 

J| of gene regulatory proteins. It is thought that each type of cell in a higher eucaryotic 
€ organism contains a specific combination of gene regulatory proteins that ensures 
g the expression of only those genes appropriate to that type of cell A given gene regu- 
latory protein may be active in a variety of circumstances and typically is involved 
in the regulation of many genes. 

In addition to diffusible gene regulatory proteins, inherited states of chromatin 
condensation are also used by eucaryotic cells to regulate gene expression. An espe- 
cially dramatic case is the inactivation of an entire X chromosome in female mam- 
mals. In vertebrates DNA methylation also functions in gene regulation, being used 
mainly as a device to reinforce decisions about gene expression that are made ini- 
tially by other mechanisms. DNA methylation also underlies the phenomenon of 
genomic imprinting in mammals, in which the expression of a gene depends on 
whether it was inherited from the mother or the father. 



| POSTTRANSCRIPTIONAL CONTROLS 

f In principle, every step required for the process of gene expression could be 

£. controlled. Indeed, one can find examples of each type of regulation, although 

i any one gene is likely to use only a few of them. Controls on the initiation of 

f . gene transcription are the predominant form of regulation for most genes. But 

| other controls can act later in the pathway from DNA to protein to modulate 

| the amount of gene product that is made. Although these pbsttranscriptional 

gf ■ controls, which operate after RNA polymerase has bound to the gene's promoter 

I and begun RNA synthesis, are less common than transcriptional control, for 

K niany genes they are crucial. 
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Figure 7-86 A mechanism to explain 
both the marked overall deficiency 
of CG sequences and their clustering 
into CG islands in vertebrate 
genomes. A black line marks the location 
of a CG dinucleotide in the DNA 
sequence, while a red "lollipop" indicates 
the presence of a methyl group on the 
CG dinucleotide. CG sequences that lie in 
regulatory sequences of genes that are 
transcribed in germ cells are unmethyJated 
and therefore tend to be retained in 
evolution. Methylated CG sequences, on 
the ojher hand, tend to be lost through 
deamination of 5-methyl C toT, unless the 
CG sequence is critical for survival. 
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